Chunhua Shen

The University of Adelaide

Where to Look: Can Foundation Models Reach a Target Viewpoint Through Active Exploration?

May 31, 2026
Via arXiv

Geo-Align: Video Generation Alignment via Metric Geometry Reward

May 22, 2026
Via arXiv

MARBLE: Multi-Aspect Reward Balance for Diffusion RL

May 07, 2026
Via arXiv

Unlocking the Power of Critical Factors for 3D Visual Geometry Estimation

Apr 23, 2026
Via arXiv

MMControl: Unified Multi-Modal Control for Joint Audio-Video Generation

Apr 22, 2026
Via arXiv

Exploring Spatial Intelligence from a Generative Perspective

Apr 22, 2026
Via arXiv

OmniJigsaw: Enhancing Omni-Modal Reasoning via Modality-Orchestrated Reordering

Apr 09, 2026
Via arXiv

TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders

Apr 08, 2026
Via arXiv

Efficient Self-Evaluation for Diffusion Language Models via Sequence Regeneration

Mar 03, 2026
Via arXiv

M$^2$: Dual-Memory Augmentation for Long-Horizon Web Agents via Trajectory Summarization and Insight Retrieval

Feb 28, 2026
Via arXiv